Abstract

In this article, we derive an explicit formula for computing a confidence interval for the mean 
of a bounded random variable. Based on the proposed confidence interval, we also develop 
multistage point estimation methods for estimating the mean value with prescribed precision 
and confidence level. 

1 Introduction 

In many areas of science and engineering, it is a frequent problem to estimate the mean of 
a bounded random variable. The conventional technique for constructing a confidence interval 
relies on the Central Limit Theorem. However, for small and moderate sample sizes, the normal 
approximation can lead to serious under-coverage of the mean. In the case of bounded random 
variables, even if the sample size is very large, the error can still be intolerable when the parent 
distribution is highly skewed toward the extremes. 

In this article, by applying an inequality obtained by Massart 1990 and Hoeffding's probability 
inequality, we derive an explicit formula for interval estimation of the mean in the bounded 
case. The formula is extremely simple. Moreover, based on this construction of the confidence 
interval, we propose multistage estimation methods for estimating the mean value with 
prescribed precision and confidence level. 



2 Explicit Formula 

Since any random variable $X$ bounded in the interval $[a, b]$ (i.e., $\Pr\{a \le X \le b\} = 1$) has a linear 
relation with the random variable $Z = \frac{X - a}{b - a}$, it suffices to consider interval estimation for the mean 
of the random variable $Z$ on the interval $[0, 1]$ (i.e., $\Pr\{0 \le Z \le 1\} = 1$) and employ the transformation 
$X = (b - a)Z + a$ to obtain an estimate of the mean of $X$. The following Theorem 1 provides 
an easy method for constructing a confidence interval for the mean of $Z$. 



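Theorem 1 Let $\delta \in (0, 1)$ and $c = \frac{9}{2 \ln \frac{2}{\delta}}$. Let $\Pr\{0 \le Z \le 1\} = 1$ and $\mu = E(Z)$. Let $\overline{Z} = \frac{\sum_{i=1}^{n} Z_i}{n}$, 
where $n$ is the sample size and $Z_i, \; i = 1, \cdots, n$ are i.i.d. observations of $Z$. Define 

$$L = \overline{Z} + \frac{3}{4 + nc} \left[ 1 - 2\overline{Z} - \sqrt{1 + nc \, \overline{Z}(1 - \overline{Z})} \right],$$

$$U = \overline{Z} + \frac{3}{4 + nc} \left[ 1 - 2\overline{Z} + \sqrt{1 + nc \, \overline{Z}(1 - \overline{Z})} \right].$$

Then, 

$$\Pr\{L < \mu < U\} \ge 1 - \delta.$$
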
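
As an illustration, the interval of Theorem 1 can be evaluated in a few lines. The following is a minimal sketch, assuming the observations are supplied as a Python sequence and that the bounds $a$ and $b$ are known; the function name `mean_confidence_interval` and its signature are our own choices for illustration.

```python
import math


def mean_confidence_interval(samples, delta, a=0.0, b=1.0):
    """Interval (L, U) of Theorem 1 for the mean of samples bounded in [a, b].

    The name and signature are illustrative assumptions.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    if not a < b:
        raise ValueError("need a < b")
    # Map the sample mean to [0, 1] through Z = (X - a) / (b - a).
    z_bar = (sum(samples) / n - a) / (b - a)
    c = 9.0 / (2.0 * math.log(2.0 / delta))
    nc = n * c
    root = math.sqrt(1.0 + nc * z_bar * (1.0 - z_bar))
    scale = 3.0 / (4.0 + nc)
    lower = z_bar + scale * (1.0 - 2.0 * z_bar - root)
    upper = z_bar + scale * (1.0 - 2.0 * z_bar + root)
    # Map the limits back through X = (b - a) Z + a.
    return a + (b - a) * lower, a + (b - a) * upper
```

For example, `mean_confidence_interval(data, 0.05)` returns limits whose coverage probability is at least 95% for data in $[0, 1]$. The lower limit may fall below $a$ and the upper limit above $b$; clipping them to $[a, b]$ does not reduce the coverage.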
To prove Theorem 1, we need some preliminary lemmas. 
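Lemma 1 Let $\alpha = \frac{2 \ln \frac{2}{\delta}}{9n}$. Let $0 < t < 1$. Then $\epsilon(t) = \frac{3\alpha(1 - 2t) + 3\sqrt{\alpha^2 + 4\alpha t(1 - t)}}{2(1 + \alpha)} > 0$ satisfies equation 

$$\exp\left( - \frac{n \epsilon^2}{2 \left( t + \frac{\epsilon}{3} \right) \left( 1 - t - \frac{\epsilon}{3} \right)} \right) = \frac{\delta}{2} \qquad (1)$$

with respect to $\epsilon$. 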



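Proof. Let $q = t + \frac{\epsilon}{3}$, where $\epsilon$ satisfies equation (1). Then $q$ satisfies the equation 
$\exp\left( - \frac{9n(q - t)^2}{2q(1 - q)} \right) = \frac{\delta}{2}$, which can be simplified as 

$$(q - t)^2 + \alpha q(q - 1) = 0 \qquad (2)$$

with two real roots $q = \frac{2t + \alpha \pm \sqrt{\alpha^2 + 4\alpha t(1 - t)}}{2(1 + \alpha)}$. Making use of the relation between $\epsilon$ and $q$, we find 
the roots of equation (1) as 

$$\epsilon_1 = \frac{3\alpha(1 - 2t) + 3\sqrt{\alpha^2 + 4\alpha t(1 - t)}}{2(1 + \alpha)} \quad \text{and} \quad \epsilon_2 = \frac{3\alpha(1 - 2t) - 3\sqrt{\alpha^2 + 4\alpha t(1 - t)}}{2(1 + \alpha)}.$$

It can be verified that $|\alpha(1 - 2t)|^2 < \alpha^2 + 4\alpha t(1 - t)$, which leads to $\epsilon(t) = \epsilon_1 > 0$ and $\epsilon_2 < 0$. 

□ 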
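
The statement of Lemma 1 can be checked numerically. The sketch below assumes $\alpha = \frac{2 \ln (2/\delta)}{9n}$, which is the value for which the exponential equation with right-hand side $\delta/2$ reduces to the quadratic in $q$; the helper names `epsilon` and `massart_exponent_bound` are hypothetical.

```python
import math


def epsilon(t, n, delta):
    """Positive root epsilon(t) of Lemma 1, assuming alpha = 2 ln(2/delta) / (9 n)."""
    alpha = 2.0 * math.log(2.0 / delta) / (9.0 * n)
    disc = alpha ** 2 + 4.0 * alpha * t * (1.0 - t)
    return (3.0 * alpha * (1.0 - 2.0 * t) + 3.0 * math.sqrt(disc)) / (2.0 * (1.0 + alpha))


def massart_exponent_bound(t, eps, n):
    """Left-hand side exp(-n eps^2 / (2 (t + eps/3)(1 - t - eps/3))) of the equation."""
    q = t + eps / 3.0
    return math.exp(-n * eps ** 2 / (2.0 * q * (1.0 - q)))


if __name__ == "__main__":
    n, delta = 50, 0.05
    for t in (0.1, 0.5, 0.9):
        # Each printed value should agree with delta / 2 up to rounding.
        print(t, massart_exponent_bound(t, epsilon(t, n, delta), n))
```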



Lemma 2 Let $t \in (0, 1)$. Then $\epsilon(t)$ is a concave function with respect to $t$. 

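Proof. By equation (2), we have $0 < t < q < 1$ and $\frac{\partial q}{\partial t} = \frac{2(q - t)}{2(q - t) + \alpha(2q - 1)} = \frac{1}{3} \frac{\partial \epsilon}{\partial t} + 1$. Consequently, 

$$\frac{\partial^2 q}{\partial t^2} < 0 \iff (q - t) - \left( t - \frac{1}{2} \right) \left( \frac{\partial q}{\partial t} - 1 \right) > 0 \iff q - t > \frac{\alpha \left( t - \frac{1}{2} \right)(1 - 2q)}{2(q - t) + \alpha(2q - 1)}.$$

Moreover, $\frac{\partial^2 \epsilon}{\partial t^2} = 3 \frac{\partial^2 q}{\partial t^2}$. Therefore, to show $\frac{\partial^2 \epsilon}{\partial t^2} < 0$, it suffices to show the inequality 
$q - t > \frac{\alpha \left( t - \frac{1}{2} \right)(1 - 2q)}{2(q - t) + \alpha(2q - 1)}$, which is equivalent to $1 > \frac{\alpha \left( t - \frac{1}{2} \right)(1 - 2q)}{2(q - t)^2 + \alpha(q - t)(2q - 1)}$ since $q - t > 0$. Note that 

$$\frac{\alpha \left( t - \frac{1}{2} \right)(1 - 2q)}{2(q - t)^2 + \alpha(q - t)(2q - 1)} = \frac{\alpha \left( t - \frac{1}{2} \right)(1 - 2q)}{2(q - t)^2 + 2\alpha q(q - 1) + \alpha q - \alpha t(2q - 1)} = \frac{\left( t - \frac{1}{2} \right)(1 - 2q)}{q - t(2q - 1)}$$

because $q$ satisfies equation (2). It follows that, to show $\frac{\partial^2 \epsilon}{\partial t^2} < 0$, it suffices to show the inequality $1 > \frac{\left( t - \frac{1}{2} \right)(1 - 2q)}{q - t(2q - 1)}$. Invoking 
the inequality $0 < t < q < 1$, we can show that $q - t(2q - 1) > 0$, which leads to the equivalent relations 

$$1 > \frac{\left( t - \frac{1}{2} \right)(1 - 2q)}{q - t(2q - 1)} \iff q - t(2q - 1) > \left( t - \frac{1}{2} \right)(1 - 2q) \iff 0 > -\frac{1}{2}.$$

The last inequality is trivially true. □ 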



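Lemma 3 Let $\beta = \frac{8 \ln \frac{2}{\delta}}{9n}$. Let $t(z) = z + \frac{3\beta(1 - 2z) - 3\sqrt{\beta^2 + 4\beta z(1 - z)}}{4(1 + \beta)}$ where $0 \le z \le 1$. Then $z - t(z) = 
\epsilon(t(z))$ and $t(z) \le z$. 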

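Proof. Let $p = t + \frac{z - t}{3}$, where $t$ satisfies $z - t = \epsilon(t)$. It follows that $\epsilon(t) = \frac{3(z - p)}{2} \ge 0$ and $t + \frac{\epsilon(t)}{3} = p$. 
Hence, by Lemma 1, 

$$\exp\left( - \frac{n \left[ \frac{3(z - p)}{2} \right]^2}{2p(1 - p)} \right) = \frac{\delta}{2},$$

which can be simplified as $(p - z)^2 + \beta p(p - 1) = 0$ with two roots $p = \frac{2z + \beta \pm \sqrt{\beta^2 + 4\beta z(1 - z)}}{2(1 + \beta)}$. Making use 
of the relation between $p$ and $t$, we find the solutions of equation $z - t = \epsilon(t)$ with respect to $t$ 
as 

$$t_1 = z + \frac{3\beta(1 - 2z) + 3\sqrt{\beta^2 + 4\beta z(1 - z)}}{4(1 + \beta)} \quad \text{and} \quad t_2 = z + \frac{3\beta(1 - 2z) - 3\sqrt{\beta^2 + 4\beta z(1 - z)}}{4(1 + \beta)}.$$

It can be shown that $|\beta(1 - 2z)|^2 \le \beta^2 + 4\beta z(1 - z)$, which leads to $t_1 \ge z$ and $t_2 \le z$. So the proof is completed by 
noting that $t(z) = t_2$. □ 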
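
The identity $z - t(z) = \epsilon(t(z))$ of Lemma 3 can also be checked numerically. The sketch below assumes $\alpha = \frac{2 \ln (2/\delta)}{9n}$ and $\beta = 4\alpha$, and it restricts attention to values of $z$ for which $t(z)$ lies in $(0, 1)$; the helper names are hypothetical.

```python
import math


def epsilon(t, n, delta):
    """epsilon(t) of Lemma 1, assuming alpha = 2 ln(2/delta) / (9 n)."""
    alpha = 2.0 * math.log(2.0 / delta) / (9.0 * n)
    disc = alpha ** 2 + 4.0 * alpha * t * (1.0 - t)
    return (3.0 * alpha * (1.0 - 2.0 * t) + 3.0 * math.sqrt(disc)) / (2.0 * (1.0 + alpha))


def t_of_z(z, n, delta):
    """t(z) of Lemma 3, assuming beta = 8 ln(2/delta) / (9 n)."""
    beta = 8.0 * math.log(2.0 / delta) / (9.0 * n)
    disc = beta ** 2 + 4.0 * beta * z * (1.0 - z)
    return z + (3.0 * beta * (1.0 - 2.0 * z) - 3.0 * math.sqrt(disc)) / (4.0 * (1.0 + beta))


if __name__ == "__main__":
    n, delta = 100, 0.05
    for z in (0.3, 0.5, 0.9):
        t = t_of_z(z, n, delta)
        # The two printed quantities should agree up to rounding.
        print(z, z - t, epsilon(t, n, delta))
```

With $\beta = \frac{4}{nc}$ and $c$ as in Theorem 1, $t(z)$ evaluated at the sample mean is the lower limit $L$, which is the fact used later in the proof of Theorem 1.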



Lemma 4 Let $0 < \mu < 1$ and $0 \le z \le 1$. Then $z - \mu \ge \epsilon(\mu)$ if $t(z) \ge \mu$. 

Proof. Let $t(z) \ge \mu > 0$. By Lemma 3, we have $z - t(z) \ge 0$ and thus $z - \mu \ge z - t(z) \ge 0$. 
We claim that $z - \mu > 0$. If this is not true, then $z = \mu$ and $t(z) \ge z > 0$. By Lemma 3, we have 
$t(z) = z > 0$. On the other hand, $t(z) = z$ results in $z = 0$. Thus we arrive at the contradiction $0 > 0$. 
So we have shown $z - \mu > 0$ and it follows that $0 \le \frac{z - t(z)}{z - \mu} \le 1$. We next show that $z - \mu \ge \epsilon(\mu)$. 

Suppose for the purpose of contradiction that $z - \mu < \epsilon(\mu)$. Then 

$$z - t(z) = (z - \mu) \frac{z - t(z)}{z - \mu} < \epsilon(\mu) \frac{z - t(z)}{z - \mu} + \left( 1 - \frac{z - t(z)}{z - \mu} \right) \epsilon(z).$$

By Lemma 2, $\epsilon(t)$ is concave with respect to $t$, hence $\epsilon(\mu) \frac{z - t(z)}{z - \mu} + \left( 1 - \frac{z - t(z)}{z - \mu} \right) \epsilon(z) \le \epsilon(t(z))$, which 
yields $z - t(z) < \epsilon(t(z))$. Recall that, by Lemma 3, $z - t(z) = \epsilon(t(z))$. It follows that $\epsilon(t(z)) < \epsilon(t(z))$, 
which is a contradiction. 

□ 

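We are now in the position to prove Theorem 1. By Theorem 1 of Hoeffding 1963, 

$$\Pr\{\overline{Z} \ge \mu + \epsilon\} \le \left\{ \left( \frac{\mu}{\mu + \epsilon} \right)^{\mu + \epsilon} \left( \frac{1 - \mu}{1 - \mu - \epsilon} \right)^{1 - \mu - \epsilon} \right\}^n, \quad \forall \epsilon \in (0, 1 - \mu). \qquad (3)$$

By Lemma 1 of Massart 1990, 

$$(\mu + \epsilon) \ln \left( \frac{\mu + \epsilon}{\mu} \right) + (1 - \mu - \epsilon) \ln \left( \frac{1 - \mu - \epsilon}{1 - \mu} \right) \ge \frac{\epsilon^2}{2 \left( \mu + \frac{\epsilon}{3} \right) \left( 1 - \mu - \frac{\epsilon}{3} \right)}, \quad \forall \epsilon \in (0, 1 - \mu). \qquad (4)$$

It follows from (3) and (4) that 

$$\Pr\{\overline{Z} \ge \mu + \epsilon\} \le \exp\left( - \frac{n \epsilon^2}{2 \left( \mu + \frac{\epsilon}{3} \right) \left( 1 - \mu - \frac{\epsilon}{3} \right)} \right), \quad \forall \epsilon > 0. \qquad (5)$$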

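By the definition of $t(\cdot)$, we can verify that $L = t(\overline{Z})$. Thus $\Pr\{L \ge \mu\} = \Pr\{t(\overline{Z}) \ge \mu\}$. Applying 
Lemma 4, we have $\Pr\{t(\overline{Z}) \ge \mu\} \le \Pr\{\overline{Z} - \mu \ge \epsilon(\mu)\}$. Hence by (5) and Lemma 1, 

$$\Pr\{L \ge \mu\} \le \Pr\{\overline{Z} - \mu \ge \epsilon(\mu)\} \le \exp\left( - \frac{n [\epsilon(\mu)]^2}{2 \left( \mu + \frac{\epsilon(\mu)}{3} \right) \left( 1 - \mu - \frac{\epsilon(\mu)}{3} \right)} \right) = \frac{\delta}{2}.$$

Since $\Pr\{L \ge \mu\} \le \frac{\delta}{2}$ has been shown, applying this conclusion to the random variable $1 - Z$, we 
have $\Pr\{U \le \mu\} \le \frac{\delta}{2}$. 

Finally, by applying Bonferroni's inequality, we have 

$$\begin{aligned}
\Pr\{L < \mu < U\} &\ge \Pr\{L < \mu\} + \Pr\{U > \mu\} - 1 \\
&= 1 - \Pr\{L \ge \mu\} + 1 - \Pr\{U \le \mu\} - 1 \\
&\ge 1 - \frac{\delta}{2} + 1 - \frac{\delta}{2} - 1 = 1 - \delta.
\end{aligned}$$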
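
The coverage guarantee of Theorem 1 can be examined exactly in the special case of Bernoulli observations, for which the sample mean takes only the values $k/n$. The sketch below is an illustration under that assumption, not part of the proof; the helper names are hypothetical.

```python
import math


def theorem1_interval(z_bar, n, delta):
    """Limits (L, U) of Theorem 1 for a sample mean z_bar in [0, 1]."""
    c = 9.0 / (2.0 * math.log(2.0 / delta))
    nc = n * c
    root = math.sqrt(1.0 + nc * z_bar * (1.0 - z_bar))
    scale = 3.0 / (4.0 + nc)
    return (z_bar + scale * (1.0 - 2.0 * z_bar - root),
            z_bar + scale * (1.0 - 2.0 * z_bar + root))


def exact_coverage(n, p, delta):
    """Pr{L < p < U} when the observations are i.i.d. Bernoulli(p)."""
    covered = 0.0
    for k in range(n + 1):
        lower, upper = theorem1_interval(k / n, n, delta)
        if lower < p < upper:
            # Binomial probability of observing exactly k successes.
            covered += math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)
    return covered


if __name__ == "__main__":
    for p in (0.02, 0.1, 0.5):
        print(p, exact_coverage(30, p, 0.05))
```

By Theorem 1, each printed coverage probability is at least $1 - \delta = 0.95$, including for the skewed case of small $p$ where the normal approximation is least reliable.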

3 Applications in Multistage Point Estimation 

We would like to note that the simple interval estimation method described above can be used 
to construct multistage sampling plans for estimating the mean value of a bounded variable with 
prescribed precision and confidence level. To illustrate such applications, we shall first present 
some general results of multistage point estimation based on confidence intervals. 

Let $X$ be a random variable parameterized by $\theta$, which is not necessarily bounded. Let 
$X_1, X_2, \cdots$ be a sequence of random samples of $X$. The goal is to estimate $\theta$ via a multistage 
sampling plan with the following structure. The sampling process is divided into $s$ stages, where 
$s$ can be infinity or a positive integer. The continuation or termination of sampling is determined 
by decision variables. For each stage with index $\ell$, a decision variable $D_\ell = \mathcal{D}_\ell(X_1, \cdots, X_{n_\ell})$ 
is defined based on samples $X_1, \cdots, X_{n_\ell}$, where $n_\ell$ is the number of samples available at the 
$\ell$-th stage. It should be noted that $n_\ell$ can be a random number, depending on the specific sampling 
scheme. The decision variable $D_\ell$ assumes only two possible values 0, 1, with the notion that the 
sampling is continued until $D_\ell = 1$ for some $\ell$. For the $\ell$-th stage, an estimator $\widehat{\theta}_\ell$ for $\theta$ is defined 
based on samples $X_1, \cdots, X_{n_\ell}$. Let $\boldsymbol{l}$ denote the index of the stage at which the sampling is terminated. 
Then, the point estimator for $\theta$, denoted by $\widehat{\theta}$, is equal to $\widehat{\theta}_{\boldsymbol{l}}$. The decision variables $D_\ell$ can be 
defined in terms of the estimators $\widehat{\theta}_\ell$ and confidence intervals $(L_\ell, U_\ell)$, where the lower confidence 
limit $L_\ell$ and upper confidence limit $U_\ell$ are functions of $X_1, \cdots, X_{n_\ell}$ for $\ell = 1, \cdots, s$. Depending 
on the error criterion, we have different sampling plans as follows. 

Theorem 2 Let $\epsilon > 0$, $\zeta > 0$ and $\delta \in (0, 1)$. For $\ell = 1, \cdots, s$, let $(L_\ell, U_\ell)$ be a confidence interval 
such that $\Pr\{L_\ell < \theta < U_\ell\} \ge 1 - \zeta\delta$. Suppose the stopping rule is that sampling is continued until 
$U_\ell - \epsilon \le \widehat{\theta}_\ell \le L_\ell + \epsilon$ at some stage with index $\ell$. Then, $\Pr\{|\widehat{\theta} - \theta| < \epsilon\} \ge 1 - \delta$ provided that 
$s\zeta \le 1$ and that $\Pr\{U_s - \epsilon \le \widehat{\theta}_s \le L_s + \epsilon\} = 1$. 

We would like to note that, for estimating the mean value of a random variable bounded in 
[a, b], Theorem 2 can be applied based on the following choice: 

(i) The sample sizes of the sampling plan are chosen as deterministic integers $n_1 < \cdots < n_s$ 
such that $n_s \ge \frac{(b - a)^2}{2\epsilon^2} \ln \frac{2}{\zeta\delta}$. 

(ii) The confidence intervals are constructed by virtue of Theorem 1. 
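
The sampling plan of Theorem 2 with choices (i) and (ii) can be sketched as follows for observations in $[0, 1]$. This is a minimal sketch under stated assumptions: `draw` is a hypothetical user-supplied function returning one observation, we take $\zeta = 1/s$ so that $s\zeta \le 1$, and the final sample size is checked against $\frac{1}{2\epsilon^2} \ln \frac{2}{\zeta\delta}$. The latter bound rests on our own observation that both $U - \overline{Z}$ and $\overline{Z} - L$ in Theorem 1 never exceed $\frac{3}{2\sqrt{nc}} = \sqrt{\frac{\ln (2/\delta)}{2n}}$.

```python
import math


def theorem1_interval(z_bar, n, delta):
    """Limits (L, U) of Theorem 1 for a sample mean z_bar in [0, 1]."""
    c = 9.0 / (2.0 * math.log(2.0 / delta))
    nc = n * c
    root = math.sqrt(1.0 + nc * z_bar * (1.0 - z_bar))
    scale = 3.0 / (4.0 + nc)
    return (z_bar + scale * (1.0 - 2.0 * z_bar - root),
            z_bar + scale * (1.0 - 2.0 * z_bar + root))


def multistage_absolute(draw, eps, delta, sizes):
    """Estimate a mean in [0, 1] to within eps using the stages in `sizes`.

    `draw` and this signature are illustrative assumptions.
    """
    s = len(sizes)
    zeta = 1.0 / s                 # guarantees s * zeta <= 1
    level = zeta * delta           # error budget of each stage
    if sizes[-1] < math.log(2.0 / level) / (2.0 * eps ** 2):
        raise ValueError("final stage too small to force stopping")
    total, count = 0.0, 0
    for n in sizes:
        while count < n:           # top up the sample to n observations
            total += draw()
            count += 1
        z_bar = total / n
        lower, upper = theorem1_interval(z_bar, n, level)
        if upper - eps <= z_bar <= lower + eps:
            return z_bar, n        # stopping rule of Theorem 2 is met
    raise RuntimeError("no stage met the stopping rule")
```

A call such as `multistage_absolute(draw, 0.1, 0.05, [50, 100, 200, 400])` returns the estimate together with the number of observations actually used. For a variable bounded in $[a, b]$, one may apply the same routine to $(X - a)/(b - a)$ with margin $\epsilon/(b - a)$.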

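Theorem 3 Let $\epsilon > 0$, $\zeta > 0$ and $\delta \in (0, 1)$. For $\ell = 1, \cdots, s$, let $(L_\ell, U_\ell)$ be a confidence interval 
such that $\Pr\{L_\ell < \theta < U_\ell\} \ge 1 - \zeta\delta$. Suppose the stopping rule is that sampling is continued 
until $[1 - \mathrm{sgn}(\widehat{\theta}_\ell)\, \epsilon] U_\ell \le \widehat{\theta}_\ell \le [1 + \mathrm{sgn}(\widehat{\theta}_\ell)\, \epsilon] L_\ell$ at some stage with index $\ell$. Then, $\Pr\{|\widehat{\theta} - \theta| < 
\epsilon|\theta|\} \ge 1 - \delta$ provided that $s\zeta \le 1$ and that $\Pr\{[1 - \mathrm{sgn}(\widehat{\theta}_s)\, \epsilon] U_s \le \widehat{\theta}_s \le [1 + \mathrm{sgn}(\widehat{\theta}_s)\, \epsilon] L_s\} = 1$, 
where $\mathrm{sgn}(x)$ is the sign function, which assumes the values 1, 0 and $-1$ for $x > 0$, $x = 0$ and $x < 0$ 
respectively. 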

We would like to note that, for estimating the mean value of a random variable bounded in 
[0, 1], we can use Theorems 1 and 3 based on multistage inverse sampling. 
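
The stopping rule of Theorem 3 is a simple predicate on the current estimate and confidence limits. The following sketch shows one way to write it; the function names are hypothetical, and the confidence limits are assumed to be computed elsewhere.

```python
def sgn(x):
    """Sign function: 1, 0 or -1 for x > 0, x = 0 or x < 0."""
    return (x > 0) - (x < 0)


def stop_relative(theta_hat, lower, upper, eps):
    """True when the relative-error stopping rule of Theorem 3 is met."""
    s = sgn(theta_hat)
    return (1.0 - s * eps) * upper <= theta_hat <= (1.0 + s * eps) * lower


if __name__ == "__main__":
    print(stop_relative(1.0, 0.95, 1.05, 0.1))   # narrow interval around 1
    print(stop_relative(1.0, 0.80, 1.20, 0.1))   # interval still too wide
```

The factor $\mathrm{sgn}(\widehat{\theta}_\ell)$ makes the same expression valid for negative estimates, where multiplying a limit by $1 \pm \epsilon$ moves it in the opposite direction.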

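Theorem 4 Let $0 < \delta < 1$, $\epsilon_a > 0$, $\epsilon_r > 0$ and $\zeta > 0$. For $\ell = 1, \cdots, s$, let $(L_\ell, U_\ell)$ be 
a confidence interval such that $\Pr\{L_\ell < \theta < U_\ell\} \ge 1 - \zeta\delta$. Suppose the stopping rule is that 
sampling is continued until $U_\ell - \max(\epsilon_a, \mathrm{sgn}(\widehat{\theta}_\ell)\, \epsilon_r U_\ell) \le \widehat{\theta}_\ell \le L_\ell + \max(\epsilon_a, \mathrm{sgn}(\widehat{\theta}_\ell)\, \epsilon_r L_\ell)$ at 
some stage with index $\ell$. Then, $\Pr\left\{ |\widehat{\theta} - \theta| < \epsilon_a \text{ or } |\widehat{\theta} - \theta| < \epsilon_r |\theta| \right\} \ge 1 - \delta$ provided that $s\zeta \le 1$ and 
that $\Pr\{U_s - \max(\epsilon_a, \mathrm{sgn}(\widehat{\theta}_s)\, \epsilon_r U_s) \le \widehat{\theta}_s \le L_s + \max(\epsilon_a, \mathrm{sgn}(\widehat{\theta}_s)\, \epsilon_r L_s)\} = 1$. 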

For estimating the mean value of a random variable bounded in $[a, b]$, Theorem 4 can be 
applied based on the following choice: 

(i) The sample sizes of the sampling plan are chosen as deterministic integers $n_1 < \cdots < n_s$ 
such that $n_s \ge \frac{(b - a)^2}{2\epsilon_a^2} \ln \frac{2}{\zeta\delta}$. 

(ii) The confidence intervals are constructed by virtue of Theorem 1. 
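
The mixed absolute and relative criterion of Theorem 4 can be written as a predicate in the same way. This is a sketch with hypothetical function names; the confidence limits are assumed to come from Theorem 1 or any other valid construction.

```python
def sgn(x):
    """Sign function: 1, 0 or -1 for x > 0, x = 0 or x < 0."""
    return (x > 0) - (x < 0)


def stop_mixed(theta_hat, lower, upper, eps_a, eps_r):
    """True when the mixed-criterion stopping rule of Theorem 4 is met."""
    s = sgn(theta_hat)
    slack_upper = max(eps_a, s * eps_r * upper)
    slack_lower = max(eps_a, s * eps_r * lower)
    return upper - slack_upper <= theta_hat <= lower + slack_lower


if __name__ == "__main__":
    # Large mean: the relative margin dominates the absolute one.
    print(stop_mixed(10.0, 9.5, 10.5, 0.1, 0.1))
    # Mean near zero: the absolute margin takes over.
    print(stop_mixed(0.05, 0.0, 0.1, 0.1, 0.1))
```

Taking the maximum of the two margins means that sampling stops as soon as either criterion can be certified, which avoids the very large sample sizes a purely relative criterion would demand when the mean is close to zero.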

In Theorems 2-4, the number of stages, $s$, is assumed to be a finite integer. In some situations, 
a sampling plan with a finite number of stages cannot guarantee the prescribed precision 
and confidence level. In this regard, the following theorems are useful. 

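Theorem 5 Let $\epsilon > 0$, $\zeta > 0$ and $\delta \in (0, 1)$. Let $\tau$ be a positive integer. Let $(L_\ell, U_\ell)$ be a 
confidence interval such that $\Pr\{L_\ell < \theta < U_\ell\} \ge 1 - \zeta\delta$ for $\ell \le \tau$ and that $\Pr\{L_\ell < \theta < 
U_\ell\} \ge 1 - \zeta\delta 2^{\tau - \ell}$ for $\ell > \tau$. Suppose the stopping rule is that sampling is continued until 
$U_\ell - \epsilon \le \widehat{\theta}_\ell \le L_\ell + \epsilon$ at some stage with index $\ell$. Then, $\Pr\{|\widehat{\theta} - \theta| < \epsilon\} \ge 1 - \delta$ provided that 
$(\tau + 1)\zeta \le 1$ and that $\Pr\{\boldsymbol{l} < \infty\} = 1$. 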
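
The role of the condition $(\tau + 1)\zeta \le 1$ can be seen by summing the per-stage error budgets: the first $\tau$ stages contribute $\tau\zeta\delta$ and the geometric tail contributes at most $\zeta\delta$. The sketch below evaluates this allocation; the function name `stage_error_budget` is hypothetical.

```python
def stage_error_budget(ell, tau, zeta, delta):
    """Allowed non-coverage probability of the confidence interval at stage ell."""
    if ell < 1:
        raise ValueError("stage index starts at 1")
    if ell <= tau:
        return zeta * delta
    # Halve the budget at every stage beyond tau.
    return zeta * delta * 2.0 ** (tau - ell)


if __name__ == "__main__":
    tau, delta = 3, 0.05
    zeta = 1.0 / (tau + 1)
    total = sum(stage_error_budget(ell, tau, zeta, delta) for ell in range(1, 200))
    # The total stays below delta no matter how many stages are used.
    print(total, delta)
```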
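Theorem 6 Let $\epsilon > 0$, $\zeta > 0$ and $\delta \in (0, 1)$. Let $\tau$ be a positive integer. Let $(L_\ell, U_\ell)$ be a 
confidence interval such that $\Pr\{L_\ell < \theta < U_\ell\} \ge 1 - \zeta\delta$ for $\ell \le \tau$ and that $\Pr\{L_\ell < \theta < 
U_\ell\} \ge 1 - \zeta\delta 2^{\tau - \ell}$ for $\ell > \tau$. Suppose the stopping rule is that sampling is continued until 
$[1 - \mathrm{sgn}(\widehat{\theta}_\ell)\, \epsilon] U_\ell \le \widehat{\theta}_\ell \le [1 + \mathrm{sgn}(\widehat{\theta}_\ell)\, \epsilon] L_\ell$ at some stage with index $\ell$. Then, $\Pr\{|\widehat{\theta} - \theta| < \epsilon|\theta|\} \ge 
1 - \delta$ provided that $(\tau + 1)\zeta \le 1$ and that $\Pr\{\boldsymbol{l} < \infty\} = 1$. 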

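Theorem 7 Let $0 < \delta < 1$, $\epsilon_a > 0$, $\epsilon_r > 0$ and $\zeta > 0$. Let $\tau$ be a positive integer. Let 
$(L_\ell, U_\ell)$ be a confidence interval such that $\Pr\{L_\ell < \theta < U_\ell\} \ge 1 - \zeta\delta$ for $\ell \le \tau$ and that 
$\Pr\{L_\ell < \theta < U_\ell\} \ge 1 - \zeta\delta 2^{\tau - \ell}$ for $\ell > \tau$. Suppose the stopping rule is that sampling is continued 
until $U_\ell - \max(\epsilon_a, \mathrm{sgn}(\widehat{\theta}_\ell)\, \epsilon_r U_\ell) \le \widehat{\theta}_\ell \le L_\ell + \max(\epsilon_a, \mathrm{sgn}(\widehat{\theta}_\ell)\, \epsilon_r L_\ell)$ at some stage with index $\ell$. 
Then, $\Pr\left\{ |\widehat{\theta} - \theta| < \epsilon_a \text{ or } |\widehat{\theta} - \theta| < \epsilon_r |\theta| \right\} \ge 1 - \delta$ provided that $(\tau + 1)\zeta \le 1$ and that $\Pr\{\boldsymbol{l} < \infty\} = 1$. 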

We would like to note that, for estimating the mean value of a random variable bounded in 
$[a, b]$, Theorems 5-7 can be used, since it can be shown that $\Pr\{\boldsymbol{l} < \infty\} = 1$ as a consequence of 
using the confidence interval described by Theorem 1. 
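
A sampling plan in the spirit of Theorem 5, with no preset number of stages, can be sketched as follows for observations in $[0, 1]$. The doubling of the sample size from stage to stage, the default values of `tau` and `n1`, and the function `draw` are all our own assumptions for illustration; the theorem itself does not prescribe them.

```python
import math


def theorem1_interval(z_bar, n, delta):
    """Limits (L, U) of Theorem 1 for a sample mean z_bar in [0, 1]."""
    c = 9.0 / (2.0 * math.log(2.0 / delta))
    nc = n * c
    root = math.sqrt(1.0 + nc * z_bar * (1.0 - z_bar))
    scale = 3.0 / (4.0 + nc)
    return (z_bar + scale * (1.0 - 2.0 * z_bar - root),
            z_bar + scale * (1.0 - 2.0 * z_bar + root))


def unbounded_multistage(draw, eps, delta, tau=3, n1=50):
    """Sample in stages of doubling size until the rule of Theorem 5 is met.

    Returns the estimate, the sample size used and the stopping stage.
    """
    zeta = 1.0 / (tau + 1)             # guarantees (tau + 1) * zeta <= 1
    total, count, ell, n = 0.0, 0, 1, n1
    while True:
        while count < n:               # top up the sample to n observations
            total += draw()
            count += 1
        z_bar = total / n
        # Error budget: zeta*delta up to stage tau, then halved per stage.
        level = zeta * delta * min(1.0, 2.0 ** (tau - ell))
        lower, upper = theorem1_interval(z_bar, n, level)
        if upper - eps <= z_bar <= lower + eps:
            return z_bar, n, ell
        ell += 1
        n *= 2                         # assumed growth schedule
```

Under this schedule the loop ends after finitely many stages, because the interval half-widths are at most $\sqrt{\ln(2/\delta_\ell)/(2 n_\ell)}$, where $\ln(2/\delta_\ell)$ grows only linearly in $\ell$ while $n_\ell$ doubles.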